Exploring Wikipedia's Category Graph for Query Classification
Authors
Abstract
Wikipedia’s category graph is a network of 400,000 interconnected category labels, and can be a powerful resource for many classification tasks. However, its size and the lack of order can make it difficult to navigate. In this paper, we present a new algorithm to efficiently explore this graph and discover accurate classification labels. We implement our algorithm as the core of a query classification system and demonstrate its reliability using the KDD CUP 2005 competition as a benchmark.
Similar resources
Extending a multilingual Lexical Resource by bootstrapping Named Entity Classification using Wikipedia's Category System
Named Entity Recognition and Classification (NERC) is a well-studied NLP task that is typically approached using machine learning algorithms, which rely on training data that is usually expensive to create. These high costs result in a lack of NERC training data for many languages. An approach to create a multilingual NE corpus was presented in Wentland et al. (2008). The resulting resource call...
Wikipedia as an Ontology for Describing Documents
Identifying topics and concepts associated with a set of documents is a task common to many applications. It can help in the annotation and categorization of documents and be used to model a person's current interests for improving search results, business intelligence or selecting appropriate advertisements. One approach is to associate a document with a set of topics selected from a fixed ont...
QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches
A major problem in information retrieval is the difficulty of defining the user's information needs; on the other hand, once the user submits a query, there is a vast amount of information to retrieve. Different methods have therefore been suggested for query expansion, which is concerned with reconfiguring the query by increasing efficiency and improving the criterion accuracy in the information...
Exploring Large RDF Datasets using a Faceted Search
We propose a facet-based RDF data exploration mechanism that lets the user visualize large RDF datasets by successively refining a query. The novel aspects of our work are: i) the SPARQL query pattern is visualized as a query graph, ii) the successive refinements are visualized in a query refinement graph, and iii) the result triples are visualized as a result RDF graph. The scheme is scalable ...
Coupling Materialized View Selection to Multi Query Optimization: Hyper Graph Approach
Materialized views are queries whose results are stored and maintained in order to facilitate access to data in their underlying base tables of extremely large databases. Selecting the best materialized views for a given query workload is a hard problem. Studies on view selection have considered sharing common sub expressions and other multi-query optimization techniques. Multi-Query Optimizati...